A Model for Robust Chinese Parser

Author

  • Keh-Jiann Chen
Abstract

The Chinese language has many characteristics that differ substantially from those of Western languages, causing conventional language processing methods to fail on Chinese. For example, Chinese sentences are strings of characters with no spaces marking word boundaries, so word segmentation and unknown word identification techniques must be used to identify words. In addition, Chinese has very few inflectional or grammatical markers, making purely syntactic approaches to parsing almost impossible; a unified approach that uses both syntactic and semantic information is needed. The parsing model proposed here therefore adopts a lexical feature-based grammar formalism called Information-based Case Grammar, which stipulates that the lexical entry for a word contains both semantic and syntactic feature structures. By relaxing the constraints on lexical feature structures, even ill-formed input can be accepted, broadening the coverage of the grammar. A priority-controlled chart parser is proposed which, in conjunction with a mechanism for dynamic grammar extension, addresses three problems: (1) syntactic ambiguity, (2) under-specification and limited coverage of grammars, and (3) ill-formed sentences. The model does this without making parsing inefficient for sentences that require neither constraint relaxation nor dynamic grammar extension.
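
The idea of trying strict analyses first and relaxing lexical constraints only when needed can be illustrated with a small sketch. This is not the chart parser described in the abstract: the feature names, the `AGENT` constraint, the example words, and the functions `violations` and `best_filler` are all hypothetical, and the priority is simply the number of violated constraints.

```python
import heapq
from itertools import count


def violations(constraints, features):
    """Count the constraint features that a candidate fails to match."""
    return sum(1 for key, want in constraints.items() if features.get(key) != want)


def best_filler(constraints, candidates, max_relaxed=1):
    """Return (word, penalty) for the highest-priority candidate, or None.

    Candidates that violate more than `max_relaxed` constraints are rejected.
    Among the rest, the one with the fewest violations is popped first, so a
    fully well-formed analysis always outranks a relaxed one.
    """
    agenda = []
    tie = count()  # tie-breaker: keeps input order among equal penalties
    for word, features in candidates:
        penalty = violations(constraints, features)
        if penalty <= max_relaxed:
            heapq.heappush(agenda, (penalty, next(tie), word))
    if not agenda:
        return None
    penalty, _, word = heapq.heappop(agenda)
    return word, penalty


# Hypothetical constraint: the agent of a verb such as "吃" (eat) is an animate noun.
AGENT = {"cat": "N", "animate": True}

well_formed = [
    ("飯", {"cat": "N", "animate": False}),
    ("他", {"cat": "N", "animate": True}),
]
ill_formed = [("石頭", {"cat": "N", "animate": False})]

print(best_filler(AGENT, well_formed))
print(best_filler(AGENT, ill_formed))
print(best_filler(AGENT, ill_formed, max_relaxed=0))
```

In this toy setting, a candidate that satisfies every constraint is always chosen ahead of one that needs relaxation, and setting `max_relaxed=0` turns relaxation off so that ill-formed input is rejected outright.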

Similar resources

A Block-Based Robust Dependency Parser for Unrestricted Chinese Text

Although substantial efforts have been made to parse Chinese, very few parsers have seen practical use because they cannot handle unrestricted text. This paper realizes a practical system for Chinese parsing by using a hybrid model of phrase structure partial parsing and dependency parsing. This system showed good performance and high robustness in parsing unrestricted texts and has been appli...

Robust Non-Explicit Neural Discourse Parser in English and Chinese

Neural discourse models proposed so far are very sophisticated and tuned specifically to certain label sets. These are effective, but unwieldy to deploy or repurpose for different label sets or languages. Here, we propose a robust neural classifier for non-explicit discourse relations for both English and Chinese in CoNLL 2016 Shared Task datasets. Our model only requires word vectors and simpl...

Applying Maximum Entropy to Robust Chinese Shallow Parsing

Recently, shallow parsing has been applied to various information processing systems, such as information retrieval, information extraction, question answering, and automatic document summarization. A shallow parser is suitable for online applications, because it is much more efficient and less demanding than a full parser. In this research, we formulate shallow parsing as a sequential tagging ...

Treebank-Based Acquisition of LFG Resources for Chinese

This paper presents a method to automatically acquire wide-coverage, robust, probabilistic Lexical-Functional Grammar resources for Chinese from the Penn Chinese Treebank (CTB). Our starting point is the earlier, proof-of-concept work of Burke et al. (2004) on automatic f-structure annotation, LFG grammar acquisition and parsing for Chinese using the CTB version 2 (CTB2). We substantially exten...

Journal:
  • IJCLCLP

Volume 1

Publication date: 1996